Constructing a Tagged E-J Parallel Corpus for Assisting Japanese Software Engineers in Writing English Abstracts

Author

  • Masumi Narita
Abstract

This paper describes how we constructed a tagged E-J parallel corpus of sample abstracts, which is the core language resource for our English abstract writing tool, the “Abstract Helper.” This writing tool is aimed at helping Japanese software engineers write more productively by providing them with good models of English abstracts. We collected 539 English abstracts from technical journals/proceedings and prepared their Japanese translations. After analyzing the rhetorical structure of these sample abstracts, we tagged each sample abstract with both an abstract type and an organizational-scheme type. We also tagged each sample sentence with a sentence role and one or more verb complementation patterns. We then show that our tagged E-J parallel corpus of sample abstracts can be effectively used for providing users with both discourse-level guidance and sentence-level assistance. Finally, we discuss the outlook for further development of the “Abstract Helper.”
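
The two tagging levels described in the abstract can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class names, field names, the helper `sentences_with_role`, and the tag values used in the usage example are hypothetical and are not taken from the paper, which does not list its tag inventory or storage format here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaggedSentence:
    """One sample sentence with its translation and sentence-level tags (hypothetical layout)."""
    english: str
    japanese: str
    sentence_role: str        # sentence-role tag; the actual label set is not given in the abstract
    verb_patterns: List[str]  # one or more verb complementation pattern tags

    def __post_init__(self) -> None:
        # The abstract says each sentence carries "one or more" patterns.
        if not self.verb_patterns:
            raise ValueError("a sentence needs at least one verb complementation pattern")


@dataclass
class TaggedAbstract:
    """One sample abstract with its abstract-level tags (hypothetical layout)."""
    abstract_type: str
    organizational_scheme: str
    sentences: List[TaggedSentence] = field(default_factory=list)


def sentences_with_role(corpus: List[TaggedAbstract], role: str) -> List[TaggedSentence]:
    """Return every sentence in the corpus tagged with the given sentence role."""
    return [s for a in corpus for s in a.sentences if s.sentence_role == role]
```

A lookup such as `sentences_with_role(corpus, "purpose")` (with `"purpose"` standing in for whatever role labels the corpus actually uses) would correspond to sentence-level assistance, while filtering abstracts by `abstract_type` and `organizational_scheme` would correspond to discourse-level guidance.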

Similar resources

A Web-based English Abstract Writing Tool Using a Tagged E-J Parallel Corpus

In this paper, we present a Web-based English abstract writing tool, the “BEAR (Building English Abstracts by Ricoh).” This English writing tool is aimed at helping Japanese software engineers improve the organization of their writing by enabling them to select a rhetorical template of the target abstract and to build up component sentences while having access to good-quality sample sentences. ...

Cross-disciplinary use of Organizational Linkers in Research Article Abstracts

Abstract This study focuses on realizations and discourse functions of the organizational linkers in the writing of research article abstracts from four disciplines. To this end, 120 research article abstracts from four disciplines namely, Applied Linguistics, Economics, Agriculture, and Applied Physics (30 from each discipline) were selected. All research article abstracts were extracted from ...

Lexical Bundles in English Abstracts of Research Articles Written by Iranian Scholars: Examples from Humanities

This paper investigates a special type of recurrent expressions, lexical bundles, defined as a sequence of three or more words that co-occur frequently in a particular register (Biber et al., 1999). Considering the importance of this group of multi-word sequences in academic prose, this study explores the forms and syntactic structures of three- and four-word bundles in English abstracts writte...

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

Publication date: 2000